Localization of maximal entropy random walk 
We define a new class of random walk processes which maximize entropy. This maximal entropy 
random walk is equivalent to generic random walk if it takes place on a regular lattice, but it is not 
if the underlying lattice is irregular. In particular, we consider a lattice with weak dilution. We 
show that the stationary probability of finding a particle performing maximal entropy random walk 
localizes in the largest nearly spherical region of the lattice which is free of defects. This localization 
phenomenon, which is purely classical in nature, is explained in terms of the Lifshitz states of a 
certain random operator. 



Since the seminal papers by Einstein [1] and Smoluchowski [2],
which formulated the theory of Brownian motion and diffusive
processes, a discrete-time realization of these processes,
random walk (RW), has continuously attracted attention. Random
walk has been discussed in thousands of scientific papers and
textbooks in statistical physics, economics, biophysics,
engineering, particle physics, etc., and is still an active
research area (see e.g. [3]). Mathematically speaking, random walk is a
Markov chain which describes the trajectory of a parti- 
cle (random walker) taking successive random steps. In 
the simplest variant, random walk on a lattice, at each 
time step the particle chooses at random one of the ad- 
jacent nodes and jumps to it. In the continuum limit, 
the probability density of finding the particle at a given 
position obeys the diffusion equation. When the lattice is 
regular, it is easy to show that all trajectories (sequences 
of nodes visited by the particle) of a given length be- 
tween two given points of the lattice are equiprobable, 
and thus have maximal entropy. Therefore, drawing an 
analogy with the path-integral formalism [4], trajectories 
are weighted only by their length, which plays the role of 
the action in the absence of potential energy. 

In this Letter we ask what changes if one takes the 
above statement as a definition of RW. In other words, 
we define random walk not by local hopping rules but 
by the requirement that RW trajectories maximize en- 
tropy. We shall see that, if the lattice is not regular, this 
new definition leads to a dramatic change in the behav- 
ior of RW. Let us summarize our main results. First, we 
define the maximal entropy random walk (MERW) and 
show that it indeed maximizes the entropy of trajecto- 
ries, in contrast to generic random walk (GRW), which 
has smaller entropy. Second, we discuss a surprising ef- 
fect of localization of MERW trajectories in the presence 
of weak disorder. This is a purely classical example of 
the Lifshitz phenomenon [5]. Some kind of localization
has been observed before in RW on networks with a broad
distribution of node degrees [6], but for MERW the effect
is completely different in nature, since it can be triggered 
by any small amount of inhomogeneity. 

To begin, let us consider quite generally a particle hopping
randomly from node to node on a given finite, connected graph.
The graph is defined by a symmetric adjacency matrix $A$, with
elements $A_{ij} = 1$ if $i$ and $j$ are neighboring nodes and
$A_{ij} = 0$ otherwise. The hopping is a local Markov process:
a particle which arrives at some moment at node $i$ will hop to
a neighboring node $j$ with probability $P_{ij}$, independently
of the past history. The elements of the transition matrix are
$P_{ij} = 0$ if $A_{ij} = 0$, that is if nodes $i$, $j$ are not
linked, and for each $i$ one has $\sum_j P_{ij} = 1$.

The main quantity of interest is the probability, $\pi_i(t)$,
of finding the particle at node $i$ at time $t$. One can
calculate it recursively, applying the Markov property:

$$\pi_i(t+1) = \sum_j \pi_j(t) P_{ji}. \qquad (1)$$



Using spectral properties of the matrix $P_{ij}$, one can show
that $\pi_i(t)$ reaches for $t \to \infty$ a unique stationary state
$\pi_i^*$ obeying the following eigenequation:

$$\pi_i^* = \sum_j \pi_j^* P_{ji}. \qquad (2)$$



For GRW, $P_{ij} = A_{ij}/k_i$, where $k_i = \sum_j A_{ij}$ is the
number of neighbors of node $i$ (node degree). This means
that the particle hops to an adjacent node with the same
probability for all neighbors. The stationary distribution
of GRW reads

$$\pi_i^* = \frac{k_i}{\sum_j k_j}. \qquad (3)$$
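As a minimal sketch of Eqs. (1)-(3), the recursion can be iterated numerically and compared with the degree-proportional stationary state. The example graph (a triangle with one pendant node) and all function names are illustrative assumptions, not taken from the text; the graph is chosen non-bipartite so that the iteration converges.

```python
def grw_transition_matrix(adj):
    """GRW transition matrix P_ij = A_ij / k_i for a 0/1 adjacency matrix."""
    degrees = [sum(row) for row in adj]
    return [[a / k for a in row] for row, k in zip(adj, degrees)]


def step(pi, P):
    """One application of Eq. (1): pi_i(t+1) = sum_j pi_j(t) P_ji."""
    n = len(pi)
    return [sum(pi[j] * P[j][i] for j in range(n)) for i in range(n)]


def stationary_by_iteration(P, steps=2000):
    """Iterate Eq. (1) starting with the particle on node 0."""
    pi = [1.0 if i == 0 else 0.0 for i in range(len(P))]
    for _ in range(steps):
        pi = step(pi, P)
    return pi


# Hypothetical example: triangle (0, 1, 2) with a pendant node 3 attached to node 2.
ADJ = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

P_GRW = grw_transition_matrix(ADJ)
PI_ITER = stationary_by_iteration(P_GRW)

# Eq. (3): pi*_i = k_i / sum_j k_j.
DEGREES = [sum(row) for row in ADJ]
PI_EXACT = [k / sum(DEGREES) for k in DEGREES]
```

On this graph the iterated distribution agrees with Eq. (3) node by node, and the pendant node (degree 1) carries one third of the weight of its degree-3 neighbor.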



Another quantity of interest, especially important from
the point of view of RW entropy, is the probability
$P(\gamma^{(t)}_{i_0 i_t})$ of generating a trajectory $\gamma^{(t)}_{i_0 i_t}$ of length $t$,
passing through nodes $(i_0, i_1, \ldots, i_t)$:

$$P(\gamma^{(t)}_{i_0 i_t}) = P_{i_0 i_1} P_{i_1 i_2} \cdots P_{i_{t-1} i_t}. \qquad (4)$$

In general $P(\gamma^{(t)}_{i_0 i_t})$ depends on all nodes on the trajectory.
For GRW we have

$$P(\gamma^{(t)}_{i_0 i_t}) = \frac{1}{k_{i_0} k_{i_1} \cdots k_{i_{t-1}}}, \qquad (5)$$

and we see that the trajectories are not equiprobable.
An exception is GRW on a $k$-regular graph, whose nodes
have identical degrees, as for instance on a regular lattice.
In general, however, trajectories produced by GRW
are not maximally random. As we will see below, there
exists, though, a choice of $P_{ij}$ such that all trajectories
of given length $t$ and given endpoints are equiprobable.
This choice corresponds to MERW.

Let us now present the explicit construction of MERW.
Let $\psi_i$ be the normalized eigenvector, $\sum_i \psi_i^2 = 1$,
corresponding to the maximal eigenvalue $\lambda$ of the adjacency
matrix $A_{ij}$:

$$\sum_j A_{ij} \psi_j = \lambda \psi_i. \qquad (6)$$

The eigenvalue $\lambda$ is clearly in the range $k_{\min} \le \lambda \le k_{\max}$,
where $k_{\min}$ and $k_{\max}$ are the minimal and maximal node
degrees of the graph, respectively. The Frobenius-Perron
theorem tells us that the eigenvector has all elements of
the same sign, so that one can choose $\psi_i > 0$. Let us use
this eigenvector to define the following transition matrix:

$$P_{ij} = \frac{A_{ij}}{\lambda} \frac{\psi_j}{\psi_i}. \qquad (7)$$



By construction, the entries $P_{ij}$ are positive if $i$ and $j$ are
neighboring nodes. They are also properly normalized:
$\sum_j P_{ij} = 1$. A similar construction has been recently
proposed in the context of optimal information coding [7].
The weight (4) is now independent of intermediate nodes:

$$P(\gamma^{(t)}_{i_0 i_t}) = \frac{1}{\lambda^t} \frac{\psi_{i_t}}{\psi_{i_0}}, \qquad (8)$$



and thus all trajectories having length $t$ and given endpoints
$i_0$ and $i_t$ are equiprobable. For a closed trajectory,
the probability (8) depends only on its length $t$. The
stationary distribution of MERW is

$$\pi_i^* = \psi_i^2, \qquad (9)$$

which is easy to check by combining Eqs. (7) and (2). It
is a normalized probability, $\sum_i \pi_i^* = 1$, and the detailed
balance condition is fulfilled: $\pi_i^* P_{ij} = \pi_j^* P_{ji}$.
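The construction of Eqs. (6)-(9) can be sketched numerically as follows. This is only an illustration under stated assumptions: the leading eigenvector is obtained by power iteration on $A + I$ (a choice made here to avoid oscillations on bipartite graphs, not a method taken from the text), and the example graph and function names are hypothetical.

```python
import math


def leading_eigenpair(adj, iterations=500):
    """Power iteration on (A + I): same eigenvectors as A, eigenvalues shifted by 1."""
    n = len(adj)
    psi = [1.0] * n
    for _ in range(iterations):
        new = [psi[i] + sum(adj[i][j] * psi[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        psi = [x / norm for x in new]
    # Rayleigh quotient of the normalized vector gives the largest eigenvalue of A.
    lam = sum(psi[i] * adj[i][j] * psi[j] for i in range(n) for j in range(n))
    return lam, psi


def merw_transition_matrix(adj):
    """Return (P, pi, lam, psi) with P_ij from Eq. (7) and pi_i = psi_i^2 from Eq. (9)."""
    lam, psi = leading_eigenpair(adj)
    n = len(adj)
    P = [[adj[i][j] * psi[j] / (lam * psi[i]) for j in range(n)] for i in range(n)]
    pi = [x * x for x in psi]
    return P, pi, lam, psi


def path_probability(P, path):
    """Product of transition probabilities along a sequence of nodes, Eq. (4)."""
    prob = 1.0
    for a, b in zip(path, path[1:]):
        prob *= P[a][b]
    return prob


# Hypothetical example: triangle (0, 1, 2) with a pendant node 3 attached to node 2.
ADJ = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

P_MERW, PI_MERW, LAM, PSI = merw_transition_matrix(ADJ)

# All trajectories of length 3 from node 0 to node 2 on this graph.
PATHS_0_TO_2 = [[0, 1, 0, 2], [0, 2, 0, 2], [0, 2, 1, 2], [0, 2, 3, 2]]
PATH_PROBS = [path_probability(P_MERW, path) for path in PATHS_0_TO_2]
```

The four listed trajectories visit nodes of different degrees, so under GRW their weights differ (for instance 1/8 versus 1/6), whereas under MERW they all equal $\lambda^{-3} \psi_2/\psi_0$, since the factors $\psi_j/\psi_i$ in Eq. (7) telescope along the path.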

We intuitively see that random trajectories generated
by the transition probabilities (7) are more random than
those generated by GRW, since now the probability of
a given random path (8) is independent of intermediate
nodes. This statement can be quantified by comparing
the entropy rates of the corresponding Markov
processes. Let $P(i_0, i_1, \ldots, i_t)$ be the probability of a
sequence $(i_0, i_1, \ldots, i_t)$ in the set of all sequences of length $t$
generated by the Markov chain. The Shannon entropy in
this set of sequences is:

$$S_t = -\sum_{i_0, \ldots, i_t} P(i_0, \ldots, i_t) \ln P(i_0, \ldots, i_t). \qquad (10)$$



One can show [8], using the Markov property of the chain,
$P(i_0, i_1, \ldots, i_t) = \pi_{i_0} P_{i_0 i_1} \cdots P_{i_{t-1} i_t}$, that for large $t$ the
entropy $S_t$ increases at a fixed rate

$$s \equiv \lim_{t \to \infty} \frac{S_t}{t} = -\sum_i \pi_i^* \sum_j P_{ij} \ln P_{ij}, \qquad (11)$$



which is independent of the initial distribution $\pi_{i_0}$. For
GRW, with $P_{ij} = A_{ij}/k_i$ and $\pi_i^*$ from Eq. (3), we obtain
the entropy production rate

$$s_{\mathrm{GRW}} = \frac{\sum_i k_i \ln k_i}{\sum_i k_i}, \qquad (12)$$



while MERW, with transition rates (7) and the stationary
distribution (9), yields $s_{\mathrm{MERW}} = \ln \lambda$. We now show that
$s_{\mathrm{MERW}}$ is indeed the maximal entropy rate which can be
obtained for any stochastic process generating trajectories
on the graph. The number of trajectories of length
$t$ on the graph is $N_t = \sum_{ij} (A^t)_{ij}$, where $A^t$ is the $t$-th
power of the adjacency matrix. In the $t \to \infty$ limit we
obtain the following asymptotic value:

$$s_{\max} = \lim_{t \to \infty} \frac{\ln N_t}{t} = \ln \lambda, \qquad (13)$$



which sets the upper limit for the entropy rate of such
processes. We see that $s_{\mathrm{MERW}} = s_{\max}$, so that MERW
indeed maximizes the entropy and the corresponding
trajectories are maximally random. As a byproduct we
obtain a lower bound for the largest eigenvalue of the
adjacency matrix:

$$\ln \lambda \ge \frac{\sum_i k_i \ln k_i}{\sum_i k_i}, \qquad (14)$$



which we have not found in the literature. For a $k$-regular
graph, $s_{\mathrm{GRW}} = s_{\mathrm{MERW}} = \ln k$. Similarly, for a bipartite
graph which has nodes of degree $k$ in one partition and of
degree $k'$ in the other one, $s_{\mathrm{GRW}} = s_{\mathrm{MERW}} = \frac{1}{2} \ln(k k')$.
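The entropy rates of Eq. (12) and $s_{\mathrm{MERW}} = \ln \lambda$, together with the bound (14), can be checked on small graphs. The sketch below is illustrative only: the two example graphs and the function names are assumptions, and the largest eigenvalue is again obtained by power iteration on $A + I$ rather than by any method specified in the text.

```python
import math


def largest_eigenvalue(adj, iterations=2000):
    """Largest adjacency eigenvalue via power iteration on (A + I)."""
    n = len(adj)
    psi = [1.0] * n
    for _ in range(iterations):
        new = [psi[i] + sum(adj[i][j] * psi[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        psi = [x / norm for x in new]
    # Rayleigh quotient of the normalized vector with respect to A.
    return sum(psi[i] * adj[i][j] * psi[j] for i in range(n) for j in range(n))


def entropy_rate_grw(adj):
    """Eq. (12): sum_i k_i ln k_i / sum_i k_i."""
    degrees = [sum(row) for row in adj]
    return sum(k * math.log(k) for k in degrees) / sum(degrees)


def entropy_rate_merw(adj):
    """MERW entropy rate: logarithm of the largest adjacency eigenvalue."""
    return math.log(largest_eigenvalue(adj))


# Hypothetical irregular graph: triangle (0, 1, 2) with a pendant node 3.
PAW = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

# Complete bipartite graph K_{2,3}: degrees k = 3 in one partition, k' = 2 in the other.
K23 = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
]

S_GRW_PAW, S_MERW_PAW = entropy_rate_grw(PAW), entropy_rate_merw(PAW)
S_GRW_K23, S_MERW_K23 = entropy_rate_grw(K23), entropy_rate_merw(K23)
```

On the irregular graph the inequality (14) is strict, while on $K_{2,3}$ both rates coincide with $\frac{1}{2}\ln 6$, as stated above for bipartite graphs with two degrees.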

As already mentioned, GRW and MERW are identical
on a $k$-regular graph. For example, GRW on a square
lattice is maximally random. The question arises how
much the two types of random walk differ on a graph or
lattice with some irregularities. For definiteness, imagine
that we remove at random a small fraction $q \ll 1$ of
non-adjacent links from an $L \times L$ square lattice with periodic
boundary conditions. In this way we obtain a lattice with
weak disorder (dilution), where most of the nodes are
of degree $k = 4$ and some of degree $k = 3$. The stationary
distribution $\pi_i^*$ for GRW is given by Eq. (3), so that the
probability of finding the particle after a long time at a
defective node is equal to 3/4 of the probability at an intact
one. The situation looks completely different for MERW,
as shown in Fig. 1, presenting density plots of $\pi_i^*$ for
different densities of defects, obtained by diagonalizing $A$
numerically and using Eq. (9). For a very low density $q$
of defects, the probability $\pi_i^*$ is smaller in the
neighborhood of defects, as in the GRW case. However, if the
number of defects increases, the stationary distribution
$\pi_i^*$ becomes localized in a nearly circular region. We will
indeed argue, using the Lifshitz argument [5], that this
localization phenomenon is observed for any finite fraction
of defects provided the linear size $L$ of the system
is large enough, and that the radius of the localization
region grows as $(\ln L)^{1/2}$.

FIG. 1: Density plots of $\pi_i^*$ for a $40 \times 40$ square lattice
with periodic boundary conditions, for the fractions
$q = 0.001, 0.01, 0.05, 0.1$ of removed links. The nodes incident with
removed links are marked with circles. Data are obtained by
an exact diagonalization of the adjacency matrix.

FIG. 2: Top: a ladder with randomly removed rungs. Bottom:
stationary distributions $\pi_i^*$ on the ladder for $L = 500$ and
various densities of defects $q = 0.01, 0.1, 0.2$. Positions of
defects are marked with vertical lines.

Let us start with a 1d example, in order to build
up some intuition. One cannot of course use a
one-dimensional chain, since removing a single link would
disconnect it. Instead, we shall consider, as a model
example, a ladder graph with periodic boundary conditions,
with a fraction $q$ of randomly removed rungs, as shown
in Fig. 2. In order to define the transition probabilities
(7) we have to solve the eigenproblem of the adjacency
matrix $A$. Let $L$ be the length of the ladder. Taking into
account the symmetry between both legs, we have:

$$\psi_{i+1} + \psi_{i-1} + r_i \psi_i = \lambda \psi_i, \qquad (15)$$



where the index $i$ runs over the $L$ nodes in the lower
leg of the ladder, say, and $r_i = 1$ if there is a rung at
position $i$, and $r_i = 0$ otherwise. Introducing the discrete
Laplacian $\Delta_{ij} = \delta_{i,j+1} + \delta_{i,j-1} - 2\delta_{ij}$, Eq. (15) can be
recast as

$$-(\Delta\psi)_i + v_i \psi_i = E \psi_i, \qquad (16)$$



where $E = 3 - \lambda$, whereas $v_i = 1 - r_i$ form a random
binary sequence with a frequency of unities or defects
($v_i = 1$) equal to $q$ and a frequency of zeros ($v_i = 0$)
equal to $p = 1 - q$. Each sequence of sites without defects
($v_i = 0$) is said to form a well. Eq. (16) is formally
identical to the eigenvalue equation of the following trapping
problem. A particle performs a random walk in
continuous time on the 1d lattice. Defects act as static
traps: whenever the particle sits at site $i$, it is annihilated
at rate $v_i$ per unit time. Trapping problems of this
kind have been studied extensively [9]. The asymptotic
long-time fall-off of the survival probability is known to
be related to the so-called Lifshitz tail in the density of
states of Eq. (16) as $E \to 0$. In the present context,
the Lifshitz argument [5] predicts that the ground state
of Eq. (16) is well approximated by that of the longest
well, i.e., $-(\Delta\psi)_i = E\psi_i$ ($i = 1, \ldots, w$), with Dirichlet
boundary conditions $\psi_0 = \psi_{w+1} = 0$, where $w$ is the
length of that well. We obtain $\psi_i \sim \sin(i\pi/(w+1))$ and
$E = 2(1 - \cos(\pi/(w+1))) \approx \pi^2/w^2$. In the 1d situation [13],
this argument is known to give an essentially
exact description of the ground state.
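The Dirichlet solution quoted above can be verified directly. The following sketch (function names are illustrative assumptions) builds $\psi_i = \sin(i\pi/(w+1))$ on a defect-free well and measures how well it satisfies $-(\Delta\psi)_i = E\psi_i$ with $E = 2(1 - \cos(\pi/(w+1)))$.

```python
import math


def well_ground_state(w):
    """Ground state of a defect-free well of length w with psi_0 = psi_{w+1} = 0."""
    # Index i runs over 0..w+1, so the two boundary sites are included.
    psi = [math.sin(i * math.pi / (w + 1)) for i in range(w + 2)]
    energy = 2.0 * (1.0 - math.cos(math.pi / (w + 1)))
    return psi, energy


def residual(psi, energy):
    """Largest violation of -(Delta psi)_i = E psi_i over interior sites i = 1..w."""
    w = len(psi) - 2
    return max(
        abs(-(psi[i + 1] + psi[i - 1] - 2.0 * psi[i]) - energy * psi[i])
        for i in range(1, w + 1)
    )


PSI_WELL, E_WELL = well_ground_state(20)
RESIDUAL = residual(PSI_WELL, E_WELL)
```

The residual vanishes up to rounding because $\sin((i+1)\theta) + \sin((i-1)\theta) = 2\cos\theta \, \sin(i\theta)$ with $\theta = \pi/(w+1)$; expanding the cosine for large $w$ gives the quoted $E \approx \pi^2/w^2$.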

In the case of MERW, we therefore predict that the
whole stationary probability is asymptotically localized
on the longest well, i.e., the longest sequence without
defects. The Lifshitz picture is indeed a good approximation,
as one can see in Fig. 2, showing the density $\pi_i^*$
obtained by numerical diagonalization of $A$. The length $w$
of the longest well can be estimated as follows. The mean
number of unities in the sequence grows as $Lq$. The mean
number of those followed by one zero is $Lqp$, by two zeros
is $Lqp^2$, and so on, so that there are $Lqp^n$ wells of
length $n$, i.e., consisting of $n$ zeros. The length of the
longest well is then given by $Lqp^w \sim 1$. Hence it grows
logarithmically with the system size, $w \approx \ln L/|\ln p|$, so
that $E_0 \approx (\pi |\ln p|/\ln L)^2$. In Fig. 3 we show that the
ground-state energy $E_0$ obtained by numerically solving
Eq. (16), averaged over binary disorder for $q = 0.1$, agrees
with the above estimate for $L$ large enough.
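To make the estimate concrete, the sketch below measures the longest well in a given cyclic defect sequence and evaluates the leading-order predictions $w \approx \ln L/|\ln p|$ and $E_0 \approx (\pi/w)^2$. The function names and the short example sequence are assumptions introduced for illustration.

```python
import math


def longest_well(defects):
    """Longest run of zeros in a cyclic 0/1 sequence (1 marks a removed rung)."""
    n = len(defects)
    if 1 not in defects:
        return n
    start = defects.index(1)  # begin the scan at a defect so runs can wrap around
    best = run = 0
    for offset in range(1, n + 1):
        if defects[(start + offset) % n] == 0:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


def lifshitz_estimates(L, q):
    """Leading-order estimates w ~ ln L / |ln p| and E_0 ~ (pi / w)^2 with p = 1 - q."""
    p = 1.0 - q
    w = math.log(L) / abs(math.log(p))
    e0 = (math.pi / w) ** 2
    return w, e0


# Hypothetical short sequence; the longest well wraps around the periodic boundary.
EXAMPLE = [0, 0, 1, 0, 1, 0, 0, 0]
W_EXAMPLE = longest_well(EXAMPLE)

# Parameters of the largest system in Fig. 3.
W_EST, E0_EST = lifshitz_estimates(960, 0.1)
```

For $L = 960$ and $q = 0.1$ the estimated well length is about 65 sites, which illustrates how slowly the localization region grows with the system size.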
FIG. 3: Ground-state energy $E_0$ of Eq. (16) on ladders versus
$\ln L$ for $L = 20, \ldots, 960$ and $q = 0.1$. The solid line shows
the estimate $E_0^{-1/2} = \ln L/(\pi |\ln p|) + B$, with $B$ fitted to the
rightmost data point.



The Lifshitz argument can be generalized to
higher-dimensional lattices [11]. The ground state of the
discretized Schrödinger equation (16) is localized in the
largest Lifshitz sphere, defined as the largest nearly spherical
region of the lattice which is free of defects. Taking
again for definiteness the example of the square lattice,
the radius $R_{\max}$ of the largest Lifshitz disk and the
corresponding ground-state energy $E_0$ can be evaluated as
follows. The number of circular regions of radius $R$ with
no defects is of order $L^2 p^{2\pi R^2}$, as there are two links per
node, so that $R_{\max} \approx (\ln L/(\pi |\ln p|))^{1/2}$. In the
continuum limit, the ground state in the disk of radius $R$
is given by $\psi(r) \sim J_0(jr/R)$, where $r$ is the distance
from the center and $j \approx 2.405$ is the first zero of the
Bessel function $J_0$. We thus obtain
$E_0 \approx (j/R_{\max})^2 \approx \pi j^2 |\ln p|/\ln L$. In higher dimension, skipping constants,
the above estimates read $R_{\max} \sim (\ln L/|\ln p|)^{1/d}$ and
$E_0 \sim (|\ln p|/\ln L)^{2/d}$. Hence the stationary probability
of MERW on a $d$-dimensional lattice in the presence of
any amount of disorder is localized in the largest Lifshitz
sphere, whose volume grows asymptotically as $\ln L$.
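The two-dimensional estimates can be evaluated for the lattice size used in Fig. 1. This is a rough sketch of the leading-order formulas only; the function name is hypothetical and the Bessel zero is hard-coded to the value quoted in the text.

```python
import math

J0_FIRST_ZERO = 2.405  # first zero of the Bessel function J_0, as quoted in the text


def lifshitz_disk(L, q):
    """Leading-order radius and ground-state energy of the largest Lifshitz disk."""
    p = 1.0 - q
    # From L^2 p^(2 pi R^2) ~ 1: R_max^2 = ln L / (pi |ln p|).
    r_max = math.sqrt(math.log(L) / (math.pi * abs(math.log(p))))
    # Continuum Dirichlet ground state in a disk of radius R: E = (j / R)^2.
    e0 = (J0_FIRST_ZERO / r_max) ** 2
    return r_max, e0


R_40, E0_40 = lifshitz_disk(40, 0.1)
```

For $L = 40$ and $q = 0.1$ this gives a radius of a little over three lattice spacings. The formulas are asymptotic in $\ln L$, so at this size they indicate a scale rather than a precise value.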

The above picture allows one to address dynamical is- 
sues. Imagine a random walker starting at a random 
site. In the course of evolution it will find a moderately 
large region free of defects, a sort of local Lifshitz sphere, 
and spend some time there before it will make an ex- 
cursion to another, larger local Lifshitz sphere, etc. The 
process will look very much like going through consec- 
utive metastable states before finally reaching the true 
ground state. Metastable states are formed not because 
of energy barriers, but because of entropy barriers [12], 
as MERW favors regions where it can maximize entropy. 
It is therefore tempting to consider MERW as a model 
of evolution in a flat fitness landscape. 

Let us close with a comment on the connection with
the path-integral formalism [4]. In the simplest case of
a free particle propagating in curved space-time from $a$
to $b$, the quantum amplitude is



where the Euclidean action $S_E$ is proportional to time $t$.
One would naively expect that all the trajectories $\gamma^{(t)}_{ab}$
should be equiprobable. We know, however, that the
propagator $K_{ab}$ of a massless scalar field is equal to the
inverse of the graph Laplacian $\Delta_{ab} = k_a \delta_{ab} - A_{ab}$,
which can be expressed as a sum over GRW and not
MERW trajectories. It would be interesting to check to
what extent the continuum theory would differ if one
constructed quantum amplitudes using MERW instead
of GRW trajectories.

In conclusion, we have shown that GRW maximizes 
local entropy and MERW maximizes global entropy of 
random trajectories. This little change in the definition 
of random walk leads to a dramatic change in the statis- 
tical properties of the system in the presence of a weak 
disorder. 